Abstract

We first introduce the Hamming distance between two strings. Then, we
apply this concept to the representations of whole numbers in base $n$
for all positive integers $n \ge 2$. We claim that a simple formula exists
for the sum of all Hamming distances between pairs of consecutive integers
from 1 to $m$, which we will derive. We also state and prove other
interesting results concerning the aforementioned topic.

1 Introduction 

The remainder of this paper is organized as follows : Section 2 provides an
introduction to the Hamming distance and its applications. Some new results
are stated with proof in Section 3. Section 4 gives Python implementations
of the algorithms developed. Finally, Section 5 concludes the paper.

2 Hamming Distance 

The Hamming distance is named after Richard Hamming. Defined between
two strings of equal length, the Hamming distance gives the number of
positions at which corresponding symbols differ. Alternatively, it measures
the minimum number of substitutions required to change one string into the
other, or the number of errors in transmission, not including insertions and
deletions, required to transform one string into the other. In this paper, we
will use the notation $H(s_1, s_2)$ to denote the Hamming distance between two
strings of equal length, $s_1$ and $s_2$.

For example, $H(\text{math}, \text{mats}) = 1$ while
$H(\text{math}, \text{math}) = 0$. The Hamming distance is not defined
between two strings of unequal length.

The following Python function calculates the Hamming distance, taking two 
strings of equal length as arguments : 

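def hamming_distance(s1, s2):
    # The Hamming distance is defined only for strings of equal length.
    assert len(s1) == len(s2)
    # Count the positions at which the corresponding characters differ.
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))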

3 Results 

Let $n \in \mathbb{N}$ with $n \ge 2$. We consider the representations of
whole numbers in base $n$. We append zeros to the left of the numerals in
case the strings are of unequal length. This happens between $m - 1$ and $m$
if and only if $m$ is a power of $n$.
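The padding convention can be illustrated with a short sketch. The helper names `to_digits` and `padded_pair` are hypothetical and are introduced here only for illustration; they do not appear in the listings of Section 4.

```python
def to_digits(m, n):
    """Return the base-n digits of m, most significant first (assumes m >= 0, n >= 2)."""
    digits = []
    while m > 0:
        digits.append(m % n)
        m //= n
    return digits[::-1] or [0]


def padded_pair(m, n):
    """Return the digit lists of m - 1 and m in base n, left-padded to equal length."""
    a, b = to_digits(m - 1, n), to_digits(m, n)
    width = max(len(a), len(b))
    return [0] * (width - len(a)) + a, [0] * (width - len(b)) + b


print(padded_pair(8, 2))  # ([0, 1, 1, 1], [1, 0, 0, 0]): 8 is a power of 2, so 7 is padded
print(padded_pair(6, 2))  # ([1, 0, 1], [1, 1, 0]): equal length, no padding needed
```

Digit lists are used instead of strings so that the sketch works for any base without choosing digit symbols.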

Lemma : $H(m, m - 1) = l + 1$, where $l$ is the exponent of the highest
power of $n$ that divides $m$.

Proof : We prove this by considering 3 cases. 

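Case 1 : $m = n^l$ for some $l \in \mathbb{N}$.

Here, $m = (100\ldots000)_n$ (1 followed by $l$ zeroes). Let $A$ be the digit
in base $n$ that represents $n - 1$. So, $m - 1 = (AAA\ldots AAA)_n$ ($l$
digits, all equal to $A$). After padding $m - 1$ with one leading zero, the
two numerals differ in all $l + 1$ positions.
Clearly, $H(m, m - 1) = l + 1$ and we are done.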

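Case 2 : $m = p n^q$ for some $p, q \in \mathbb{N}$ where $q \ge 1$ and
$n \nmid p$ (so that $q$ is the exponent of the highest power of $n$ that
divides $m$). Let,

$$m = n^k a_k + n^{k-1} a_{k-1} + \dots + n^2 a_2 + n^1 a_1 + n^0 a_0 \tag{1}$$

Trivially, all $a_j$ for $j < q$ will be zeros.

And,

$$m - 1 = n^k b_k + n^{k-1} b_{k-1} + \dots + n^2 b_2 + n^1 b_1 + n^0 b_0 \tag{2}$$

Here, $a_0 = 0$ as $n$ divides $m$, while $a_q \ge 1$ as $n$ does not
divide $p$.

Now, $m - 1$ will have all $b_j = n - 1$ for $j < q$ and $b_q = a_q - 1$.
For $i > q$, on the other hand, $a_i = b_i$.

Thus, $H(m, m - 1) = q + 1$ and we are done.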
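As a concrete instance of Case 2, with numbers chosen here purely for illustration, take $n = 10$ and $m = 3400$, so that $p = 34$ and $q = 2$:

```latex
\begin{align*}
m     &= 3400, & (a_3, a_2, a_1, a_0) &= (3, 4, 0, 0), \\
m - 1 &= 3399, & (b_3, b_2, b_1, b_0) &= (3, 3, 9, 9), \\
H(m, m - 1) &= 3 = q + 1.
\end{align*}
```

The digits differ in positions 0, 1 and 2 only ($b_0 = b_1 = n - 1$ and $b_2 = a_2 - 1$) and agree in position 3.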

Case 3 : $n$ does not divide $m$. Let

$$m = n^k a_k + n^{k-1} a_{k-1} + \dots + n^2 a_2 + n^1 a_1 + n^0 a_0 \tag{3}$$

$$m = \sum_{i=0}^{k} n^i a_i \tag{4}$$

Similarly,

$$m - 1 = n^k b_k + n^{k-1} b_{k-1} + \dots + n^2 b_2 + n^1 b_1 + n^0 b_0 \tag{5}$$

$$m - 1 = \sum_{i=0}^{k} n^i b_i \tag{6}$$

Since $n$ does not divide $m$, $a_0 \ge 1$. Therefore $b_0$, the remainder of
$m - 1$ upon division by $n$, is simply $a_0 - 1$, and $b_i = a_i$ for all
$i \ge 1$. Therefore $H(m, m - 1) = 1$ and we are done. ■
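The lemma can also be checked numerically for small values. The following is a minimal brute-force sketch, not part of the original listings; the function names `digits`, `hamming_consecutive`, `exponent` and `lemma_holds` are assumptions made for this illustration.

```python
def digits(m, n):
    """Base-n digits of m, most significant first (m >= 0, n >= 2)."""
    out = []
    while m > 0:
        out.append(m % n)
        m //= n
    return out[::-1] or [0]


def hamming_consecutive(m, n):
    """H(m, m - 1) in base n, computed directly from the zero-padded digits."""
    a, b = digits(m - 1, n), digits(m, n)
    a = [0] * (len(b) - len(a)) + a  # m - 1 never has more digits than m
    return sum(x != y for x, y in zip(a, b))


def exponent(m, n):
    """Largest l such that n**l divides m (m >= 1)."""
    l = 0
    while m % n == 0:
        m //= n
        l += 1
    return l


def lemma_holds(max_m, max_n):
    """Check H(m, m - 1) == l + 1 for 1 <= m <= max_m and 2 <= n <= max_n."""
    return all(
        hamming_consecutive(m, n) == exponent(m, n) + 1
        for n in range(2, max_n + 1)
        for m in range(1, max_m + 1)
    )


print(lemma_holds(2000, 12))  # True
```

Such a check is no substitute for the proof, but it exercises all three cases, including composite bases such as $n = 4$ with $m = 8$.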



Now that we have found a way to calculate the Hamming distance between
two consecutive numbers in base $n$, we will derive a formula for the sum of
all such Hamming distances up to a given $m$. Let

$$S(m) = H(0, 1) + H(1, 2) + H(2, 3) + \dots + H(m - 2, m - 1) + H(m - 1, m) \tag{7}$$

$$S(m) = \sum_{i=0}^{m-1} H(i, i + 1) \tag{8}$$

From the lemma, we know that $H(m, m - 1) = l + 1$ where $l$ is the exponent
of the highest power of $n$ that divides $m$.



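Thus,

$$S(m) = \sum_{i=1}^{m} (P_i + 1) \tag{9}$$

where $P_i$ represents the exponent of the highest power of $n$ that divides
$i$. Trivially, the least value that $P_i + 1$ will ever attain for any $i$
is 1. Since $n^j$ divides $i$ exactly for $j = 0, 1, \dots, P_i$, we can write,
using Iverson brackets,

$$S(m) = \sum_{i=1}^{m} \sum_{j=0}^{\infty} [n^j \mid i] \tag{10}$$

which we can then rewrite, by exchanging the order of summation, as

$$S(m) = \sum_{j=0}^{\infty} \sum_{i=1}^{m} [n^j \mid i] \tag{11}$$

But there are $\lfloor m / n^j \rfloor$ integers from 1 to $m$ that are
divisible by $n^j$, so

$$S(m) = \sum_{j=0}^{\infty} \lfloor m / n^j \rfloor \tag{12}$$

Finally, we only need to take the sum up to $j = \lfloor \log_n m \rfloor$,
since $m < n^j$ for $j > \log_n m$.

And we are done. ■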
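The step from equation (9) to equation (12) can be sanity-checked numerically. The sketch below is an illustration only; the names `exponent`, `s_by_lemma` and `s_by_formula` are hypothetical and chosen here for clarity.

```python
def exponent(i, n):
    """Largest l such that n**l divides i (i >= 1), i.e. P_i in equation (9)."""
    l = 0
    while i % n == 0:
        i //= n
        l += 1
    return l


def s_by_lemma(m, n):
    """Equation (9): sum of P_i + 1 over 1 <= i <= m."""
    return sum(exponent(i, n) + 1 for i in range(1, m + 1))


def s_by_formula(m, n):
    """Equation (12): sum of floor(m / n**j) over all j with n**j <= m."""
    total, power = 0, 1
    while power <= m:
        total += m // power
        power *= n
    return total


# Worked example for m = 10, n = 2: 10 + 5 + 2 + 1 = 18.
print(s_by_lemma(10, 2), s_by_formula(10, 2))  # 18 18
```

For $m = 10$ and $n = 2$ the exponents $P_1, \dots, P_{10}$ are 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, which sum to 8; adding the ten ones gives 18, in agreement with the formula.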

4 Implementation 

The following Python code finds the sum of Hamming distances in base n
up to a given m :

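m = int(input('Enter a number : '))
n = int(input('Enter a base : '))


def base10toN(num, base):
    # This helper is called but not defined in the listing; the version here
    # is an assumed implementation that returns the base-`base` numeral of
    # `num` as a string (bases up to 36).
    symbols = '0123456789abcdefghijklmnopqrstuvwxyz'
    numeral = ''
    while num > 0:
        numeral = symbols[num % base] + numeral
        num //= base
    return numeral or '0'


def hamming_distance(s1, s2):
    assert len(s1) == len(s2)
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))


s = 0
x = 0
while x < m:
    s1 = base10toN(x, n)
    s2 = base10toN(x + 1, n)
    if len(s1) != len(s2):
        # x + 1 is a power of n, so x is one digit shorter: pad it with a zero.
        s1 = '0' + s1
    s = s + hamming_distance(s1, s2)
    x = x + 1

print(s)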

The following Python code prints the sum of all Hamming distances from
0 to m in base n using the results proved in this paper :

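m = int(input('Enter a number : '))
n = int(input('Enter a base : '))

# Count the divisions by n needed to bring m down to at most 1.
# No power of n above n ** h contributes a non-zero term to the sum.
j = m
h = 0
while j > 1:
    j = j // n
    h = h + 1

# Add floor(m / n ** h) for every exponent from h down to 0.
s = 0
while h >= 0:
    s = s + (m // (n ** h))
    h = h - 1

print(s)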
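For reuse outside an interactive script, the same formula can be wrapped in a function. The following is a sketch under the assumption that repeated floor division may replace explicit powers of n, which holds because floor(floor(m / n) / n) equals floor(m / n**2); the name `hamming_sum` and the argument check are hypothetical additions.

```python
def hamming_sum(m, n):
    """Sum of H(i, i + 1) for 0 <= i < m in base n, via the closed formula."""
    if m < 0 or n < 2:
        raise ValueError("expected m >= 0 and n >= 2")
    total = 0
    while m > 0:
        total += m  # adds floor(m / n**j) for j = 0, 1, 2, ...
        m //= n
    return total


print(hamming_sum(10, 2))        # 18
print(hamming_sum(10**18, 10))   # 1111111111111111111
```

The loop runs about $\log_n m$ times, so a value such as $m = 10^{18}$, far out of reach for the pairwise comparison above, is handled immediately.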



5 Conclusions 

We conclude that the sum of the Hamming distances of consecutive integers 
is given by a simple formula and can be computed easily and efficiently 